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FAST METHOD FOR THE FORWARD AND INVERSE MDCT IN AUDIO CODING 



CROSS-REFERENCE TO RELATED APPLICATIONS 

This patent application claims the benefit of the filing date of United States Provisional 
5 Patent Application Serial Nos. 60/181,271, filed February 9, 2000 and entitled "Fast Method 
/ for the Forward and Inverse MDCT in Audio Coding"; and 60/184,685, filed February 24, 
2000 and entitled "Fast Method for the Forward and Inverse MDCT in Audio Coding", the 
entire contents of which are hereby expressly incorporated by reference. 



1 0 FIELD OF THE INVENTION 

The present invention relates to audio and image data compression and decompression 
5 applications. More specifically, the invention relates to a system and method for fast 
^ computation of modified discrete cosine transform (MDCT) and its inverse modified discrete 
Q cosine transform (IMDCT). 

^ n BACKGROUND OF THE INVENTION 

□ Discrete cosine transform (DCT) is used extensively in data compression for image and 

speech signals. For example, the authors in N. Amed, T. Natarajan, and K. R. Rao, "Discrete 
Cosine Transform," IEEE Trans. Commun., vol. COM-23, pp. 90-93, Jan. 1974 [1]; and Man 

il 20 IK Chao and Sang UK Lee, "DCT methods for VLSI parallel Implementations," IEEE 
Trans, On Acoustics, Speech and Signal Processing, vol. 38, no. 1, pp. 121-127, January 
1990 [2], the contents of which are hereby incorporated by reference, describe DCT methods 
for data compression. Many methods have been proposed to compute the DCT. For 
example, see Nam IR Cho and Sang UK Lee, "DCT Methods for VLSI Parallel 
25 Implementations, " IEEE Transactions on Acoustics, Speech and Signal Processing, vol.38, 
no.l, pp.121-127, January 1990 [11]; and T. K. Truong, I. S. Reed, I. S. Hsu, H. C. Shyu, and 
H. M. Sho, "A Pipelined Design of a Fast Prime Factor DFT on a Finite Field," IEEE Trans. 
Computers, vol. 37, pp. 266-273, Mar. 1988 [12], the contents of which are hereby 
incorporated by reference. 
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Recently, Kek in C. W. Kok, "Fast Method for Computing Discrete Cosine Transform," 

IEEE Trans, on Signal Processing, vol. 45, no. 3, pp. 757-760, March 1997 [3], the contents 

of which are hereby incorporated by reference, suggested a new DCT method for computing 

the DCT of length N - N\>^ Ni, where N\ is an even number and Ni is an odd number. In 
5 this method, the DCT with even length is decomposed into two DCT's with length N 12, 

If N 12 is an odd number. 

Further, it follows from Michael T. Heideman, "Computation of an Odd Length DCT 

from a Real-Value DFT of the Same Length," IEEE trans, on Signal Processing, vol. 40, no. 

1, pp. 54-61, January 1992 [4], the contents of which are hereby incorporated by reference, 
10 that such a DCT of length N 12 can be computed by the use of the identical length discrete 

Fourier transform (DFT) method developed by Winograd. This method is described in Dean 
\u R Kolba and Thomas W. Parks, "A Prime Factor FFT Method Using High-Speed 

Convolution," IEEE Trans on Acoustics, Speech and Signal processing, vol. ASSP-25, no. 4, 
g pp. 281-294, August 1977 [5]; and S. Winogard, "On Computing the Discrete Fourier 
,y 15 Transform," Mathematics of Computation, vol. 32, no. 141, pp. 175-199, January 1978. [6], 

the contents of which are hereby incorporated by reference. 
Q By using the ideas in [3, and 4], a fast method can be developed to compute an 18-point 

Q DCT. In other words, the standard 18-point DCT is decomposed first into two 9-point DCT. 
;2 Then, such a 9-point DCT can be implemented by Winograd's DFT method. However, 
^ 20 computing the MDCT and its IMDCT of an even length sequence can involve an extensive 

computation. 

Therefore, there is a need for a fast method to compute either the MDCT or IMDCT so 
that it can be implemented easily on either a processor, such as a digital signal processor 
(DSP) or computer. Most recently, the author in K. Konstantinides, "Fast Subband Filtering 
25 in MPEG Audio Coding," IEEE Signal Processing Letters, vol. 1, no. 2, pp. 26-28, Feb. 
1994 [7], the contents of which are hereby incorporated by reference, suggested that the 
MDCT and IMDCT in MPEG audio coding can be converted first into the standard DCT and 
IDCT, respectively. Such a DCT or IDCT can be computed rapidly by the use of Lee's fast 
method in Byeong Gi Lee, "A New Method to Compute the Discrete Cosine Transform," 
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IEEE Trans on Acoustics. Speech and Signal processings vol. ASSP-32, no. 6, pp. 1243- 
1245, Dec. 1984 [8], the contents of which are hereby mcorporated by reference. 

SUMMARY OF THE INVENTION 

The modified discrete cosine transform (MDCT) and its inverse modified discrete 
cosine transform (IMDCT) can involve an extensive computation, for example, in layer III of 
the MPEG audio coding standard. The present invention computes an 18-point DCT in a 
fast and efficient manner. Furthermore, using this 18-point DCT method, two new methods 
are developed to compute the MDCT and its IMDCT, respectively. The number of 
multiplications and additions needed to implement both of these two new methods are 
reduced substantially. 

In one aspect, the invention describes a method performed by a computer for computing 
modified discrete cosine transfer comprising the steps of: 

/^x [[-X26-*)-K27 + A)]-A^ for 0<Jfc<8 
computmg x(k) = \ ^ \ yj * 

\[y{k-9)-y{26-k)yb, for 9^Jt<17" 

17 

computing r{«) = ^x(A:)cos[— (2* + l)rt] for 0^/i<17; 

36 

defining 7(0) = Y\0)I2 ; and computing V{n) = YXn) - - 1) for 1 S w < 1 7 . 

In another aspect, the invention describes an MPEG encoder/decoder comprising: 

means for computing jc(Jl:) = ^ "^^ * - 

\[y(lc-9)-y{26-k)lb, for 9<k<l7'' 

means for computing K'(w) = ^a:(*)cos[— (2)5: + l)«] for 0<w^l7; 

A-o 36 

means for defining 7(0) = 7'(0)/2 ; and means for computing Y{n) = Y'(n)-Y(n-l) for 

lS/7^17. 
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The encoder/decoder may also comprise of: means for computing r '(it) = Y{k) ■ for 



0<A:<17 ; 



17 

means for computing y'in) = ^Y\k) cos[-^ (2k + l)n] for 0 ^ « S 1 7 ; 



means for computing y'(n) = •« 



y(« + 9) for 

0 for 

-y''(27-n) for 

-y-in -27) for 



0</i<8 

n = 9 

10<n^26 ' 
27£rt<35 



IS-l 



5 means for defining ^(O) = J^Y^k) -c^ ; and means for computing y(n) = y'{n) - y{n - 1) 



;i; BRIEF DESCRIPTION OF THE DRAWINGS 

The objects, advantages and features of this invention will become more apparent from 
p 10 a consideration of the following detailed description and the dravnngs, in which: 
JiJ FIG. 1 is an exemplary hardware butterfly diagram for a fast 18-18 DCT method; 

;fl FIG. 2 is an exemplary hardware butterfly diagram for a fast 36-18 MDCT method; and 

jl FIG. 3 is an exemplary hardware butterfly diagram for fast 1 8-36 IMDCT described. 

15 DEATILED DESCRIPTION 

TTie system and method of the present invention computes the MDCT and its IMDCT, 
respectively with a substantially reduced number of multiplications and additions. 
Additionally, a computer simulation shows that the speed of these two new methods for 
computing the MDCT and IMDCT are 7.2 times and 4.3 times faster dian the direct 

20 methods, respectively. The present invention significantly improves the performance and 
effectiveness of data compression methods such as, MP III, MPEG, and other audio/video 
compression methods. By this means, the MDCT or IMDCT defined in Hwang-Cheng 



for l</7<35. 
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Chiang and Jie-Chemg Liu, "Regressive Implementation for the Forward and Inverse MDCT 
in MPEG Audio Coding, " IEEE Signal Processing Letters, vol.3, no.45 pp. 11 6- 11 7, April 
1996 [10], the contents of which are hereby incorporated by reference, can be converted first 
into the standard DCT of length iV = 1 8 . Then such an 1 8-point DCT can be implemented 
5 by using the same length Winograd's DFT method. 

The advantage of this new method over the previous methods is that the 1 8-point DCT 
can be used to compute both the MDCT and IMDCT simultaneously. As a consequence, the 
computation of the 1 8-point IDCT needed to perform the MDCT method developed in Hsieh 
S. Hou, "A Fast' Recursive Method for Computing the Discrete Cosine Transform," IEEE 

10 Trans on Acoustics, Speech and Signal processing, vol. ASSP-35, no. 10, pp. 1455-1461, 
Oct. 1987 [9], the contents of which are hereby incorporated by reference, is completely 
avoided in this new MDCT method. With these new techniques, the new MDCT and 
IMDCT methods require only a small fraction of the number of multiplications and additions 
that are required in direct methods, respectively. 

15 Fast Method for Computing the 1 8-point DCT 

Let x{k) be a sequence with length = 18 . The DCT of x{k) defined in [3] is given by 



In (1), X{n) can be decomposed into the even and odd indexed output of the DCT. That is, 




for 0<n<n 



(1) 



A{n) = X{2n) 



for 0 < < 8 



(2) 



B(n)^X{2n-^\) 



for 0 < < 8 



(3) 



By [3], (2) and (3) can be shown to be two DCT's with length 9 as fellows: 




for 0 < A2 < 8 



(4) 



where 
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• # 



aik) = xik) + xilS-\-k) (5) 

and 



9-1 



BXn) = yb{k)cos[—(2k + \)n] for 0<«<8 (6) 

t:i 2x9 



where 



b{k) = 2[xik) - x(l 8 - 1 - k)] cos[— ^ (2k + 1)] (7) 

2x18 



B\n) = B(n) + B(n-l) for 0<«<8 W 

Note that 5(0) = B(-\) . From (8), one obtains 

Bm-^ (9) 



Using the sequence obtained by (6), for 0 < « < 8 can be obtained by the use 

Q 5 of both (9) and the recursive form in (8). 

U Since A(n) in (4) is a 9-point DCT, then, it is shown in [5] that this DCT can be 

. ^ 

'i=-* computed by the use of the identical-length DFT method developed by Winograd [5,6]. To 
illustrate this, first, the decomposition of (4) into two even and odd values of n of the DCT 
appears as follows: 



9-1 



A(2n) = ya(k)cos[-^i2k'h\)2n] for 0<n<4 (10) 

2x9 



10 and 



9-1 



^(9-2«) = ya(ifc)cos[^— (2A: + l)(9-2«)] for 0<n<4 (11) 

2x9 
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It follows from [4] that (10) and (11) can be shown to the following: 

A(2n) = (-l)"Re{Y,a'(k)o)"'} (12) 

and 

A(9 - 2«) = (- 1)"" Im {g aXk) o)'"} (13) 



— th 

where co = e ^ is the 9 root of unity and a^k) is defined by 

,,,, U(-ir'k + 4) for 0<k<4 

[a{i-iy^'i9~k)-^4) for 5<k<S (15) 

Define the 9-point DFT as follows: 

AXn) = f^aXk)a)"' (16) 



5 It is shown in [4] that the transform in (16) can be computed with a minimum number of 

operations by Winograd's method. The substitution of (16) into (14) and (13) yields 
A(2n) = {-\yRc{A\n)} for 0<«<4 (17) 

^(9-2«) = (-l)""'lm{^'(«)} for 5<M<8 (18) 

where A'(n) is given in (16). 

The detailed method given in [5] for computing the 9-point DCT in (4) is given in 

Appendix A. One observes from this method that the number of multiplications and 
10 additions are 10 and 34, respectively. In a similar fashion, by replacing a(k) and A(n) by 

b(k) and B'(n) respectively, (6) can also be computed by the use of the method given in 

Appendix A. 
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Let x{k) be a sequence of length 18. The fast method for computing the 18-point DCT 
defined in (1) comprises the following three steps: 

(1). Compute a(k) and b(k) from (5) and (7). That is, 



a{k) = x(k)'hx(lS-l-k) for 0<A:<8 



6()t) = 2(jc(/:)-x(18-l-A:))cos[— (2A: + 1)] for 0<it<8, 

36 



(2). Use the fast method given in Appendix A to compute the 9-point DCT of a{k) , i.e. 
Ain) in (4) and the 9-point DCT of b(k) , i.e., B' (n) in (6) for 0 < n < 8 . 
^ (3). Use both the recursive formulae for 1 < n < 8 in (8) and (9) to compute B(n) from 

« the known for 0<rt<8. That is, 5(0) = 5'(0)/2 , and 

P 10 B(n) = B'in) -Bin -I) for 1 < « < 8 . Then set X(2n) = A(n) , for 0 ^ w < 8 And 

^ X(2n + 1) = B(n) for 0 < « < 8 , where X(n) is a 18-point DCT of x(k) given 

I in(l). 
n 

From the method described above, it can be shown that the total number of 
15 multiplications and additions needed to compute (1) are 29 and 97, respectively. In contrast, 
324 multiplications and 306 additions are required for a direct computation of (1). An 
exemplary butterfly diagram of a fast 18-18 DCT is depicted in FIG. 1. FIG. 1 shows an 
exemplary hardware butterfly diagram for the fast 18-18 DCT method. The button diagrams 
indicate the computation flows. The pre-computed constants are 



20 =2cos[—(2k -¥])] for 0<A:<8. 

36 
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Fast method for Computing the MDCT 

Let >'(A:) be a sequence of length = 36, the MDCT of y(k) defined in [10] is given by 



36-1 



7(,2) = yj;(A:)cos[-^(2^ + l + 18)(2« + l)] for 0<w<17 
k^o 2*36 



(19) 



rnsteadof computing (19), an alternating technique is employed. To see this, replacing k by 
36 - 1 - ^ in (19), one obtains 



35 



Y(n) = J; ;;(36 - 1 - ^) cos[^((2 * 36)(2« + 1) - (2^ + 1 - 1 8)(2« + 1))] 



(20) 



k=0 



5 But 



'^o^tT^((2 * 36(2« + 1) - (2^ + 1 - 18)(2« + 1))] = -cos[-|-(2A: + 1 - 18)(2n + 1)] 

1 ^ <f\ Z JO 



2*36 
Thus, (19) becomes 



35 



r(«) = _yj;(36-l-A:)cos[^^(2^ + l-18)(2« + l)] for 0<«<17 (21) 
^ 2*36 



k=0 



If 



then 



r(«) = 7(«) + y(/7"l) for 0<A7<17 



(22) 



35 



•(«) = \k )cos[-^2k + 1 - 18)(2/i)] 
to 2*36 



(23a) 



10 where. 



y'(k) = -2y{36-l-k)cos[ (2A:+1-18)] for 0<k<35 

2*36 



(23b) 
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The substitution of (21) into (22) yields, 



36-1 



. rin) = -Y,y(^6-\-k) cos[-|- (2^ + 1-1 8)(2« + 1)] 



k=0 
36-1 



k=0 
35 



k=0 



4 


*36 








*36 




^ ( 




*36 







It - " (24) 



(24) can be shown as 



Y\n) = -f2y{-i6-\ - ^)cos[-^(2A: + 1 - 18)]cos[-^(2^ + 1 - 18)(2«)] 

TTo 2*36 2*36 ^25) 

= |;yWcos[:^(2^ + l-18)«] 



where y\k) is given in (23 b). 
O 5 From (21), one observes that 7(0) = ^(-1) . Using this result and the recursive formula 

in (22), one yields 

7(0) = r(0)/2 (26a) 
Y{n)^Y\n)-Y{n^\) for l<n<17 (26b) 

Using the sequence y\n) obtained by (25b), the sequence y{n) for 0<rt<17 can be 

obtained by the use of (26a) and (26b). 

Let 
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\yik + 9) for 0<A:<26 (27a) 
v" (k) = < 

[y'ik -27) for 27 < A: < 35 (27b) 



Then 



Y'(n) = ty"(k)cos[^(2k + l)n] (28) 

Jt=o Jo 



The proof of this is similar to that used in Lemma 1 in [7]. To prove this, (25) can be 
rewritten as: 



r(n) (k)cos[^(2k + 1-1 8)n] Wcos[:^(2^ + 1 - 18>] (29) 
*=o 36 t=9 36 



,n Let k = k'+9 , (29) becomes 



y'(rt)= E y(^'+9)cos[:^(2^'+l)»]+|:y(/r'+9)cos[:^(2A:'+l)/i] (30) 

jf=-9 JO k=0 JO 



5 Also, let A:'+9 -k-21 in the first summation and let k'^k in the second summation of (30). 
Then 



Y\n) = £y(A:-27)cos[— ((2A: + l)w-(2x36)w)]+f]y(A: + 9)cos[— (2A: + 1)«] (31) 
k=7i 36 36 



But 



cos[— ((2A: + \)n - (2 x 36)«)] = cos[— (2^ + \)n\ 
36 36 



Thus, (31) becomes 



n 



Y\n) = ^;;»cos[— (2^ + \)n] (32) 



where >'''(A:) is given in (27a) and (27b). 
10 For given y\k) as defined in (27a) and (27b), 
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Xin) = Y'in) = Y,xik) cos[— (2k + 1)«] for 0 < « < 1 7 

36 



(33) 



k=0 



where 



x(k) = y\k) -f /(36 -\-k) for 0<^<1 (34) 
The proof of this is also similar to that used in Lemma 2 in [7]. To see this, (32) can be 
rewritten as 



Y'(n) = Y,'(n)-\-Y^'in) for 0<n<l 



(35) 



where 



Y,'in) = '^y"(k)cos[^i2k + l)n] 
^ 36 



A=0 



(36) 



5 and 



35 



Y,\n)=Y^y"{k)cos[^i2k + \)n] 

Jo 



Let = 36 - 1 - A^' . Then ' {n) in (37) becomes 



(37) 



y2 • («) = i; y (36 - 1 - ) cos[:^ (2(36 - 1 - ) + \)n] 
^ 36 



(38) 



However, 



cos — (2 X 36« - {2k' + l)rt) - cos — (2A:' + \)n 
36 36 



snce, let A:'= /: , (38) becomes 



^2 = £ y (36 - 1 - )t') cos — (2 it' + 1)/7 

36 



*'=0 



The substitution (39) and (36) into (35) yields 



(39) 
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'(«) - "(^ )<=os[:^(2^ + + "(36 - 1 - A: )cos[;^(2A: + 1)«] 

• 36 



17 



(40) 



= Y,xik)cos[—(,2k +l)n] for0<«<17 
*=o 36 



where x(k) is given in (33). From the above, it can be seen that (33) is an 18-point DCT 
which can be efficiently computed by the use of the fast method given in the previous 
section. 



Suppose that y{k) is a sequence of length N = 36. For 0<A:<8, one has 
27 < 36 - 1 - )t < 35 . Thus, from (27) and (23b), (34) becomes 



x{k) = y"ik) + y''(36-\-k) 
= y'{k + 9) + y'(S-k) 

= -2;;(26 - k) cos[-^ (2k + 1)] - 2:v(27 + k) cos[-^ (2k + 1)] 
2x36 2x36 

= [-y(26 -k)- y(21 + it)] • 2cos[-^(2A: + 1)] 

2x36 



Similar, for 9 < ^ < 17 , (34) becomes 



x(k) = y'(k + 9) + y'(AA-k) 

= -2:^(26 - k)cos[-^(2k + 1)] - 2y(k - 9)cos[— ^(71 - 2k)] 
2x36 2x36 

= {y(k -9)-y(26-k)\-2 cos[^ (2k + 1)] 



since cos[-^— (71 - 2A:)] = -cos[-^— (2A: + 1)] , 
2x36 2x36 



Let b^=2 cos[ 



be the pre-computed constants. The fast method for 



computing the MDCT of y{k) in (19) is composed of the following three successive steps of 
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computation: 

(1) . Compute 

([-y(26-k)-y(27^k)]-b, for 0<^<8 
[[y(k-9)-yi26-k)]'b, for 9<A:<17' 

(2) . Use the fast method described in Section II to compute the 18-point DCT of x(k), 

i.e., YXn) defined in (40). That is,; 



17 



y • (n) = £ x(k) cos[— (2k + l)n] ■ for 



0<^ <17 I 
(3). Let 7(0) = 7'(0)/2. 

Then compute Y(n) = YXn) - - 1) for 1 < « < 17 . 

1 0 This new method is simpler and faster than that of the brute-force method. The number 

of multiplications and additions needed for this method and the brute-force method is given 
in Table 1 . Table 1 shows that the multiplications and additions needed to implement this 
new method require only 10.03% and 20.95% of the brute-force method, respectively. The 
butterfly diagram of the fast 36-18 MDCT is depicted in FIG. 2. FIG. 2 illustrates an 

15 exemplary hardware butterfly diagram for the above fast 36-18 MDCT method. The button 
diagram indicates the computation flow. The pre-computed constants are 



6,=2cos[-^(2^ + l)]. 
2 X 36 



Fast Method for Computing the IMDCT 

Let Y(k) for 0<A:<17 be the output sequence of (19). The inverse MDCT of Y{k) 
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defined in [10] is given by 



18-1 



y(rt) = Y,Y(k)cos[-^(2n + l-¥\S)i2k^-\)] for 0<rt<35 (41) 
k^o 2*36 



If y(n) = y{n) + y{n - 1), using a proof similar to that used in (24)-(26), then (41) can be 
expressed by 



y(«)^yr()fc)cos[-^(2« + 18)(2A: + l)] for 0<n<35 (42) 
k^o 2*36 



where 



r(A:) = 2r(A:)cos[--^(2A: + l)] for 0<A:<17 (43) 
2* 36 



5 It follows from (4 1 ) that 



18-1 



:^;(0) = |;y(A:)cos[-|-(19)(2^ + 1)] (44) 
2*36 



Using the sequence y' («) obtained by (42), again, y(n) can be obtained by the use of the 
initial condition y(0) given in (44) and the recursive formula y(n) = y'{n)- y(n-l) for 
l<n<35. 
Define 



h'(n-9) for 9<«<35 
y (n) = < 

\y'in + 27) for 0<n<8 



(45) 
(46) 



10 From (42), (45) and (46) can be shown to the following 



y (n) = y(«- 9) = yr(^)cos[-^(2^ + l)2A7] for 9 < « < 35 (47) 

k=o 2 * 36 
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y'(«) = y(« + 27) = yr(^)cos[— ^(2« + 72)(2fc + l)] for 0<«<8 (48) 

2*36 



In (47), it is can be shown that y" (18) = 0 . Based on [7], the following can be proved: 
In (47) and (48), 



y'(i8 + y) = -y(i8-y) for i<7<9 



y(i8 + ;) = y(i8-y) 



for 10<y<17 



Thus, for 1^7^ 9 , from (47), one has 



/(1 8 + y) = X Y' ik) cos[-|- • 2(1 8 + j)(2k + 1)] 
" 2*36 



Jfc=0 
18-1 



= X r • (A:) cos[^ (2 * 1 8)(2^ + 1) + ^ (2;)(2^ + 1)] 



4*18 



5 Using cos(^ + 5) = cos A • cos 5 - sin ^ • sin 5 , (5 1 ) becomes 



18-1 



y\\ 8 - 7) = X YXk) cos[-^ 2(1 8 - j)(2k 4- 1)] 

4*18 



k=0 
18-1 



= Xn*)sin[y(2^ + l)] sin[^(2^ + l)] for 1<7<9 



A=0 



From (52) and (53), ^"(18 + j) = -y"{lS- j) for 1 < y < 9 . 
For 10 < y < 1 7 , from (48), one has 



(49) 



(50) 



(51) 



y\n + j) = -2Y'(k) sin[^ ilk + 1)] • sin[-f(- (Ik + 1)] for 1 < y < 9 (52) 
Similarly, 



(53) 
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y\l S-j) = |;7'(^)cos[-^(2(18 - J) + 12)(2k + 1)] 



17 



2*36 
108;r 



(54) 



Again, using cos(^ -B) = cos A • cos 5 + sin ^4 ■ sin 5 , (54) becomes 



y\\^- j) = Z Y\k) sin[^ {2k + 1) ■ sin ^ (Ik + 1)] 



(55) 



A=0 



However, 



sin^(2/: + 1) = -sin^ 



|(2^ + l)J 



Hence, (55) becomes 



y-il S-j) = -^Zy'ik) sin[| (Ik + 1)] • sin[^(2^ + 1)] 



(56) 



From (52) and (54), y\\ 8 + y) = -y"(\ 8 - y) for 1 0 < y < 1 7 . 



It follows from the above proof that one only computes y'in) for 0 < /i < 17 . In other 
words, for given y'(n), where 0<n<17, y\n) for 19<rt<35 can be obtained from 
using (49) and (50). Recall that ^"(18) = 0. 
Also, by defining 

y^in) = -y'in) for 0 < 7 < 8 



y"'(n) = y''in) 



for 9<j<\l 



(57) 
(58) 



Then 



0125 ;^» = gmcos[^(2^ + l)/7i for 0<«<17 
" As a result, for 0 < « < 8 , from (48), one yields 



(59) 
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yXn) = X Y'ik) cos[-^ n{2k + 1) + ;r (2^ + 1)] 
*=o 2*18 

= -Y^Y\k)cos[-^{lk + \)n] 

= -y\x) 

For 9 < « < 1 7 , from (47), one obtains y"{n) = y\n) . 

In (59), y'"in) is an 1 8-point DCT. By replacing y'\n) and Y\k) by Xin) and xik), 
respectively, the 1 8-point DCT in (59) is the same as (1). As a consequence, the fast method 
developed in Section II can also be used to compute (59). 
5 From ( 45) and (46), one has 

fy^n + 9) for 0 < « < 26 
V («) = < 

l/(«-27) for 27<«<35 

Together with (49), (50), (57) and (58), yXn) can be expressed in terms of y^in) as 
follows: 

yXn) = y"(n + 9) = y'^n + 9) for 0 < « < 8 

y'in) = y\n + 9) = y"{\ 8) = 0 for « = 9 

/(«) = /(« + 9) 

= /(18 + («-9)) 

= -y\n-in-9)) forlO</7<18 
= -/(27-«) 
= -r(27-«) 

y\n) = -y'"{21-n) forl9<n<26 
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y'(„) = -y^^n - 27) for 27 < « < 35 

Let Y{k) for 0<^<17 be the output sequence of (19). Also let 



\ h^=2 cos[ (2A: + 1)] 1 be the pre-computed constants which are the same as those 

2x36 I 



\9n 



defined in the MDCT method in the previous section. Finally, let/c^= cos[ "^^"^ (2k + 1)] . / 

2 X 36 / 



The fast method for computing (41) is comprised of the following 4 steps of computation: 
5 (1). Compute 

YXk) = Y{k)-b;^ for 0<k<n 
(2). Use the fast method described in Section II to compute the 1 8-point DCT 
of Y(k) given in (59). That is, 



/'(«) = yrWcos[— ^(2yt + l)«] for 0<«<17 
*=o 2*18 



10 (3). Compute 



yXn) = 



y^in + 9) for 0 < « < 8 

0 for « = 9 

-y^ill-n) for 10<«<26 

-y^n-n) for 27<«<35 



18-1 



(4). hQXy{Qi) = Y,Y{k) c, 



Then compute y{n) = y\ri) - y{n - 1) for 1 < n < 35 . 

It is easy to observe that this new method for computing the IMDCT is simpler and 
15 faster than that of the brute-force method. The number of multiplications and additions 
needed for this fast method and the brute-force method is given Table 1 . Table 1 shows 
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that the number of multiplications and additions required to implement this new method are 
only 7.25% and 10.03%, respectively, of the brute-force method. The butterfly diagram of 
the fast 18-36 IMDCT is depicted in Figure 3. 
Program Implementation and Simulation Results 
5 The MDCT and its IMDCT methods described above can be computed simultaneously 

by using the 18-point DCT. These two new methods are implemented using C-f-+ language 
run on a Pentium 133AVindows 98 platform, in which the multiplications and additions are 
implemented using double data type. An example of the source code is given in Appendix 
B. The fast methods are verified by comparing their results of 10^ computations to those of 

10 the bruit-force methods. The computation times are provided in Table 1, which is in 
millisecond averaged fi*om 10^ computations. These new methods for computing the 
MDCT and IMDCT require 0.0218 ms and 0.0375 ms as compared to 0.1567 ms and 0.1616 
ms required by the brute-force methods, respectively. 

As a consequence, the new methods for computing the MDCT and IMDCT are 7.2 and 

15 4.3 times faster than the brute-force methods, respectively. In addition to audio compression 
such as MP III method, the resulting improvement is also useful for video compression 
methods such as MPEG method. 

Additionally, the method of the present invention may be implemented in a custom 
application-specific integrated circuits (ASICs), general-purpose programmable DSP using 

20 firmware to program the DSP devices, and customizable arrays such as programmable logic 
arrays (PLAs) and field-programmable arrays (FPAs). Because the present invention uses 
the same encoding and decoding (recording and playback), the complexity of software, 
firmware, and hardware is significantly reduced resulting in cost improvement and 
flexibility, in addition to speed improvement. 

25 FIG. 3 is an exemplary hardware butterfly diagram for fast 18-36 IMDCT described. 

The button diagram indicates the computation flow. The pre-computed constants are 

b, =2cos[-^-(2it + l)] and 
2x36 
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18-1 



19;r 



yiO)^Y.^(k)-c, where c, = cos[-— (2^ + 1)] 

2x36 



k=0 



Table I: NUMBER OF MULTIPLICATIONS AND ADDITIONS NEEDED TO 
IMPLEMENT THE MDCT AND IMDCT 





IMDCT 








IMDCT 




|step 1 


Step 2 


Step 3 


Total 


Step 1 


Step 2 


Step 3,4 


Total 


New 


Multipliers jl8 


29 " 


0 


47 


18 


29 


18 


65 


Method 


Adders |l8 


94 


17 


129 


0 


94 


35 


129 


Brute-force 


Multipliers 


648 




648 


method 


Adders | 


630 




612 



Table II: MDCT AND IMDCT EXECUTION TIMES IN MILL! SECOND 





MDCT 


IMDCT 


New method 


0.0218 


0.0375 


Brute-force methods 


0.1567 


0.1616 



It will be recognized by those skilled in the art that various modifications may be made 
to the illustrated and other embodiments of the invention described above, w^ithout departing 
1 0 from the broad inventive scope thereof. It will be understood therefore that the invention is 
not limited to the particular embodiments or arrangements disclosed, but is rather intended to 
cover any changes, adaptations or modifications which are within the scope and spirit of the 
claims in the area of signal processing of audio and video signals including compression of 
audio and video signals. 
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Appendix A: 

This appendix contains the method developed in [5] for computing the 9-point DCT 
given by 



9-1 



A(n) = ^a(k) cos 



k=0 



n 



2x9 



{Ik + \)n 



for 0<n<8 



Method for the 9-point DCT: 

c, = -0.866025, C2 = 0.939693, = -0.173648, = -0.766044 
C5 = 0.5, = -0.342020, c, = -0.984808, Cg = -0.642788 

d, = fl(3) + a(5), = a(3) - a(5), d^ = a(6).+ a(2), = a(6) - a(2) 
i/, = a{X)^a{l\d,=a{y)-a{l),d, = a{%)^ a{^),d^= a{%)- ai^) 

d^ =a(4) + i/5,(i,o =c/| ^d^,d^^ =d^Q^rdT,d^^ =d.^-dT,d^.^ =i/, - dT,dt^ =d^-d^ 
c?,5 =i/2 -<3?4,t/,6 =dti+ds,d„ =d^ +dg,d^g =d^ -d^,d^g =d2+d^ 
w, =Cfd^,m2 =c^d^,m^ =c^d^^,m^ =c^d^2,m^ =C3t/,3 

=C4^14''"7 =C\d\6''^S =Cf,d^T,m^ =C^d^^,m^(, =i^8^^19 
C?20 =a(4)-m2,c/2, =d2o+m^,d22 =^20-^^,^23 =f/20 +'"5 

d24=m^+mi,d2s='n^-mg,d26=m^+mg 

4, =i/, +J,i,Ai =/w,o-i/26'A2 =m^-d2i,Aj =my,A^ =^22-^"^ 
A =^25 -'"9' ^6 = "ij -i^9. A7 =i/24 +m,o,^ =d2i+m^ 



Thus, the 9-point DCT requires only 10 multiplications and 34 additions, respectively. 
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Appendix B: Source Code 

//**************************+************ 

// File name: mdct.cpp 

//****•*-******■***★********* + *****★***** + * + 

#include <iostream. h> 
#include <iomanip.h> 
#include <math.h> 
^include <time.h> 

#pragina hdrstop 
#include <condefs.h> 
#include "dct_lib.h" 

USEUNIT { "dct_lib . cpp" ) ; 

// 

int borgt 100036] ; 
double 

bin [100036] , *ptbin, boutO [ 36] , bout 1 [36] ; 

// 

#pragnia argsused 

int main (int argc, char* argv[]) 

{ 

int Jc,n; 
int i,cnt; 
double diff; 
clock_t clcl,ck2; 

cout« "welcome to mdct test\n"; 
build_costab ( ) ; 

// the random data 

randomizeO; 

f o r ( k= 0 ; lc< 1 0 0 0 3 6 ; lc++ ) 

borg[k] =random(255) ; 

for (k=0;k<100036;k++) bin [ k] = (double) 

borg[k] ; 

// verify the MDCT and the fast methods 

cnt=100000; 

ptbin=bin; 

cout«"Verifying MDCT3618 and MDCT3618F, 
cnt="«cnt<<endl«endl ; 
for (i=0; Kent; i++) 
{ 

MDCT3618 (ptbin,boutO) ; 
MDCT3618F(ptbin,boutl) ; 
ptbin++; 

for (n=0;n<18;n++) 
{ 

diff=( (boutO[n]- 
boutl [n] )<. 0000000001) ?0: 99999; 
if (diff>0) 

cout«dif f«setw(10) «bout0 {n]«setw(10)< 
<boutl [n] «"\n"; 

' } 

} 

// verify the IMDCT and the fast methods 
ptbin=bin; 

cout<<"Verif ying IMDCT1836 and 




IMDCT1836F, cnt="«cnt<<endl«endl ; 
for (i=0; i<cnt;i++) 
{ 

IMDCT1836 (ptbin,boutO) ; 
IMDCT1836F(ptbin,boutl) ; 
ptbin++; 

for (n=0;n<36;n++) 
{ 

diff=( (boutO[n]- 
boutl[n] )<. 0000000001) 70:99999; 
if (diff>0) 

cout«dif f «setw ( 10) <<boutO [n] <<setw ( 10) <<bo 
utl[n]«"\n"; 
) 

} 



cnt=100000; 

// computation time of MDCT3618 
cout« "Computation time of MDCT3618 
ptbin=bin; 
ckl=clock ( ) ; 
for (i=0; i<cnt;i++) 
{ 

MDCT3618 (ptbin, boutO) ; 
ptbin++; 

} 

ck2=clock() ; 

cout«setw(6) « (ck2-ckl) «endi; 



// Computation time of MDCT3618F 
cout<<"Computation time of MDCT3618F 
ptbin=bin; 
ckl=clock () ; 
for (i=0; Kent; i++) 
{ 

MDCT3618F (ptbin, boutl) ; 
ptbin++; 

) 

ck2=clock() ; 

cout«setw(6) « (ck2-ckl) «endl; 



//Computation time of IMDCT1836 
cout«"Computation time of IMDCT1836 
ptbin=bin; 
ckl=clock ( ) ; 
for (i=0; Kent; i++) ■ 
{ 

IMDCT1836 (ptbin, boutO) ; 
ptbin++; 

1 

ck2=cloek() ; 

cout«setw(6) << (ck2-ckl) «endl; 



// Computation time of IMDCT1836F 

cout<<"Computation time of IMDCT1836F : "; 

ptbin=bin; 

ckl=clock ( ) ; 

for (i=0; i<cnt; i++) 
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{ 

IMDCT1836F (ptbin, boutl ) ; 
ptbin++; 

) 

ck2=clock() ; 

cout<<setw (6) << (ck2-ckl) <<encll<<endl; 
return 0; 
} 

//*********★■****♦************************ 
// File name: cict_lib.cpp 

//*************+***********+★★*★*******★* 



#include <iostream. h> 
#include <math,h> 
#pragma hdrstop 

#include "dct_lib.h" 

// 

#pragma package ( smart_init ) 

// some pre-computed consine tables 
double Cos99t9] [9] ; 

double Cos99W[9]; // used for DCT99W 

double Cosl818[18] [18]; 

double Cosl818F[18] ;// DCT1818F 

double Cos3618[183 [36] ; // shared for 

mdct3618, imdctl836 

double Cos3618F[18] ; 

double Cosl836F[18] ; 



// the COS tables for various size 

void build_costab(void) 
[ 

int n,k; 
// DCT99 

for (n=0;n<9;n++) 

for (k-0;k<9;k++) 

Cos99[n] [k]=cos(M_PI/18* (2*k+l) *n) ; 

// for DCT99W 
Cos99W[l]=Cos99(l] [7] ; 
Cos99W[2]=Cos99[23 [0] ; 
Cos99W[3]=Cos99[2] [2] ; 
Cos99W[4]-Cos99[2] [3] ; 
Cos99W[5]=Cos99[2] [1] ; 
Cos99W[6]=Cos99(l] [5] ; 
Cos99W[7]=Cos99(l] [8] ; 
Cos99W[8]=Cos99[l] [6] ; 



// DCT1818 

for (n=0;n<18;n++) 

■ for (k=0;k<18;k++) 

Cosl818[n) [k]=cos (M_PI/3 6* (2*k+l) *n) 

// DCT1818F 



for(k=0;k<18;k++) 

Cosl818F[k]=2*Cosl818[l] [k]; 

// DCT3618 

for (n=0;n<18;n++) 

for {k=0;k<36;k++) 

Cos3618[n] [k] =cos (M_PI /72 * {2*k+19)* (2 

+1)); 

// DCT3618F 
for(k=0;k<18;k++) 

Cos3618F[k]-2*cos (M_PI/72* (2*k+l) ) ; 

// IMDCT1836F 
for(k=0;k<18;k++) 

Cosl836F{k]=cos(M_PI/72*19* (2*k+l) ) ; 

) 

//==™==========™— ==============- 



// 9-9 DCT, standard 

/ /==========^ ================ 

void DCT99 (double *a, double *A) 
[ 

int n, k; 
double s; 

for (n=0;n<9;n++) 
{ 

s=0; 

for {k=0;k<9;k++) 
{ 

s+=a[k] *Cos99[n] [k] ; 

} 

A[n3=s; 
} 

} 



// 9-9 DCT, Winogard 

void DCT99W (double *a, double *A) 
{ 

double d[64],m[64]; 

d[l]=a[31+a[5] ; 
d[2]=a[3]-a[5] ; 
d[3]=a[6]+a[2]; 
d[4]-a[6]-a[2]; 
d[5]=a[l]+a[7] ; 
d[61-a[l]-a[7] ; 
d[7]=a[8]+a[0] ; 
d[8]=a[8]-a[0] ; 

d[ 91=a[ 4]+d[5]; 
d[10]=d[ l]+d[31; 
d[ll]=d[10]+d[7]; 
d[12]=d[ 3]-d[7); 
d[13]-d[ ll-d[7]; 
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d[14 


=dl l]-d[3), 




d[15 


=d[ 2]-d[4], 




d[16 


-dtl5]+d[8] , 




d( 17 


=d[ 4]+d(8], 




d[ 18 


=d[ 2]-d[8], 




d[19 


=d[ 2]+d[4], 




m[ 1 


l=Cos99W[l] *d[ 6]; 


ml 2 


l=Cos99W[5]*d( 5]; 


rn[ 3 


l=Cos99W[5J*d[ll]; 


m[ 4 


)=Cos99W[2]*d[12]; 


m[ 5 


l=Cos99W[3]*dtl3]; 


m[ 6 


l=Cos99W[4] *d[14] ; 


m[ 7 


)=Cos99W[l] *d[16] ; 


m[ 8 


]=Cos99W[6] *d[17] ; 


m[ 9 


]=Cos99W[7]*d[18]; 


m[10 


]=Cos99W[81*d[19]; 


d[20 


]-a[ 4]-m[2] 




d[21 


l=dt20]+m[4] 




d[22 


l=d[20]-in[4] 




d[23 


)=d[201+m[5) 




d[24 


]=in[ ll+mt8] 




d[25 


]=m[ l)-mt8] 




d[26 


]=m[ l]+m[9] 




AtO] 


=d[ 9]+d[lll 




A[ll 


=inllO]-d[263 




A[2]=m{ 6]-dt21] 




A[3) 


=m[ 7]; 




A[4] 


-d[22]-m[ 5] 




A[51 


=d[25]-m[ 9] 




A[6] 


=in[ 3]-dt 9] 




AE7] 


=d[24]+m[10] 




A[81 


=d[23]+m[ 6] 




} 

//== 







// 18-18 DCT, standard 

void DCT18 18 (double *a, double *A) 
{ 

int n,k; 
double s; 



void DCTl818F(double *x, double *X) 
{ 

int n, k; 

double aE9],b[9] ,A[9] ,B[9] ,Bp[9] ; 

for(k=0;k<9;k++) 
( 

a[k]=x(k]+x[18-l-k] ; 
b[k]=(x[k]-x[18-l-k] ) *Cosl818Ftk] ; 
) 

DCT99W{a,A); 
DCT99W(b,Bp) ; 

BtO]=Bp[0]/2; 

for (n=l;n<9;n++) B[n]=Bp[n] -B[n-1] ; 

for (n=0;n<9;n++) 
{ 

X[2*n]-A[nl; 

X(2*n+l]=B[n]; 

} 

} 



================= = 

// 36-18 DCT, standard 

void MDCT3618 (double *a, double *A) 
{ 

int n,k; 
double s; 

for (n=0; n<18;n++) 
{ 

s=0; 

for (k=0;k<36;k++) 
{ 

s+=a[k]*Cos3618[nl [k]; 

} 

A[n] =s; 
) 

}^ 



for ( n=0 ; n< 1 8 ; n++ ) / 

{ // 36-18 DCT, fast 

3=0 ; / ^======================== 

for (k=0;k<18;k++) void MDCT3618F (double *y, double *Y) 

( { 

s+=a[k]*Cosl818(n] [k] ; int n,k; 

} double x[18] , Yp[18] ; 
A[n)=s; 

} for (k=0; k<9;k++) 
} x[k)=(-y[26-kl-y[27+k] ) *Cos3618F[k] ; 

/ 7======:=.========^========================= for ( k=9 ; k< 1 8 ; k++ ) 

x[k]-( ytk-9 ]-y[26-k] )*Cos3618F[k] ; 

DCTl818F(x, Yp) ; 

/ /================^=====================:= Y ( 0 ] =Yp [ 0 ] /2 ; 

// 18-18 DCT, fast for (n=l ; n<18;n++) Y [n] =Yp [n] -Y [n-1 ] ; 
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// 18-36 IMDCT, standard 

void IMDCT1836 (double *Y, double *y) 



#ifndef lib_dctH 
#define lib_dctH 

void build_costab (void) ; 
void DCT99 (double *a, double *A) ; 
void DCT99W (double *a, double *A) ; 
void DCT1818 (double *a, double *A) ; 
void DCT1818F (double *a, double *A) ; 
void MDCT3618 (double *a, double *A) ; 
void MDCT3618F (double *a, double *A) ; 
void IMDCT1836 (double *Y, double *y) ; 
void IMDCT1836F (double *y, double *y) ; 
#endif 



int n, k; 
double s; 



for {n=0;n<36;n++) 



s=0; 

for (k=0;k<18;k++) 



s+=Y[k] *Cos3618 [kl tn] ; //share 
with the same table as 3618 



y [n]=s; 



//= 



// 36-18 DCT, fast 

void IMDCT1836F (double *Y, double *y) 
{ 

int n,k; 

double Yp[18] ,yppp[18] ,yp[36] ,s; 

for(k=0;k<18;k++) 

Yp[k]=Ytkl *Cos3618F[k] ; //the same 
table 

DCTl818F(Yp,yppp) ; 

for(n« 0;n< 9;n++) yptn]= yppp[n+9]; 
for(n= 9;n<10;n++) yp[n]= 0; 
for (n=10;n<27;n++) yp (n) =-yppp [27-n] ; 
for (n=27;n<36;n++) yp [n] =-yppp [n-27 ] ; 

s-0; 

//for (k=0;k<18;k++) s+=Yp[k]; 

for (k=0;k<18;k++) s+=Y[k] *Cosl836Ftk3 ; 

ytO]=s; 

for (n=l;n<36;n++) y tn] =yp(n] -y [n-1] ; 
} 
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